eRFSVM: a hybrid classifier to predict enhancers-integrating random forests with support vector machines
Abstract
BACKGROUND Enhancers are tissue-specific distal regulatory elements that play vital roles in gene regulation and expression. Predicting and identifying enhancers is an important but challenging problem in bioinformatics. Existing computational methods, mostly single classifiers, can only predict enhancers defined by the transcriptional coactivator EP300 and generalize poorly.

RESULTS We built a hybrid classifier called eRFSVM, using random forests as base classifiers and a support vector machine as the main classifier. eRFSVM integrated two components, eRFSVM-ENCODE and eRFSVM-FANTOM5, which used different features and labels. Each base classifier was a random forest trained on datasets from a single tissue or cell line. The main classifier, a support vector machine, made the final decision with the predictions of the base classifiers as its inputs. For eRFSVM-ENCODE, we trained on datasets from cell lines including Gm12878, Hep, H1-hesc and Huvec, using ChIP-Seq datasets as features and EP300-based enhancers as labels. Tested on the K562 dataset, eRFSVM-ENCODE reached a precision of 83.69 %, which was much better than existing classifiers. For eRFSVM-FANTOM5, with enhancers identified by RNA in the FANTOM5 project as labels, the precision, recall, F-score and accuracy of eRFSVM were 86.17 %, 36.06 %, 50.84 % and 93.38 %, respectively. These are relative increases of 23.24 %, 97.05 %, 76.90 % and 4.69 % over the existing algorithm, which scored 69.92 %, 18.30 %, 28.74 % and 89.20 %, respectively.

CONCLUSIONS These results demonstrate that eRFSVM is a better classifier for predicting both EP300-based and FANTOM5 RNA-based enhancers.
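The two-level design described in the abstract (one random forest per tissue or cell line, with a support vector machine deciding on their outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of ChIP-Seq features, and assumes the SVM is trained on a held-out split, a detail the abstract does not state. The helper `base_outputs` and all variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): one binary problem, later split into
# four "cell line" training sets, one set for the SVM, and one test set.
X, y = make_classification(n_samples=1800, n_features=10,
                           n_informative=5, random_state=0)
parts = np.array_split(np.arange(len(y)), 6)
base_parts, meta_idx, test_idx = parts[:4], parts[4], parts[5]

# Base level: one random forest per (simulated) cell line.
forests = [
    RandomForestClassifier(n_estimators=50, random_state=0).fit(X[idx], y[idx])
    for idx in base_parts
]


def base_outputs(models, features):
    """Stack each forest's positive-class probability into one column per forest."""
    return np.column_stack([m.predict_proba(features)[:, 1] for m in models])


# Main level: an SVM trained on the base classifiers' outputs.
svm = SVC(kernel="rbf").fit(base_outputs(forests, X[meta_idx]), y[meta_idx])

# Final decision on unseen samples.
pred = svm.predict(base_outputs(forests, X[test_idx]))
accuracy = float((pred == y[test_idx]).mean())
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The SVM only ever sees a four-column matrix, one column per base forest, so adding a tissue means training one more forest and widening that matrix by one column.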
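The FANTOM5 figures in the abstract are relative increases over the baseline values, not percentage-point differences. The short check below recomputes them from the reported numbers, and recomputes the eRFSVM F-score from its precision and recall, assuming the usual harmonic-mean (F1) definition. The function and variable names are illustrative only.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1), on the same scale as the inputs."""
    return 2 * precision * recall / (precision + recall)


def relative_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100


# (eRFSVM value, existing-algorithm value), in percent, as reported in the abstract.
reported = {
    "precision": (86.17, 69.92),
    "recall": (36.06, 18.30),
    "f_score": (50.84, 28.74),
    "accuracy": (93.38, 89.20),
}

gains = {name: round(relative_increase(new, old), 2)
         for name, (new, old) in reported.items()}
erfsvm_f = round(f_score(86.17, 36.06), 2)

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} % relative to the existing algorithm")
print(f"F-score recomputed from precision and recall: {erfsvm_f:.2f} %")
```

The recomputed gains match the 23.24 %, 97.05 %, 76.90 % and 4.69 % quoted in the abstract, and the recomputed F-score matches the reported 50.84 %.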